Local genetic correlations exist among neurodegenerative and neuropsychiatric diseases

Genetic correlation (\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_g$$\end{document}rg) between traits can offer valuable insight into underlying shared biological mechanisms. Neurodegenerative diseases overlap neuropathologically and often manifest comorbid neuropsychiatric symptoms. However, global \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_g$$\end{document}rg analyses show minimal \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_g$$\end{document}rg among neurodegenerative and neuropsychiatric diseases. Importantly, local \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_g$$\end{document}rg s can exist in the absence of global relationships. To investigate this possibility, we applied LAVA, a tool for local \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_g$$\end{document}rg analysis, to genome-wide association studies of 3 neurodegenerative diseases (Alzheimer’s disease, Lewy body dementia and Parkinson’s disease) and 3 neuropsychiatric disorders (bipolar disorder, major depressive disorder and schizophrenia). We identified several local \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_g$$\end{document}rg s missed in global analyses, including between (i) all 3 neurodegenerative diseases and schizophrenia and (ii) Alzheimer’s and Parkinson’s disease. For those local \documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$$r_g$$\end{document}rg s identified in genomic regions containing disease-implicated genes, such as SNCA, CLU and APOE, incorporation of expression quantitative trait loci identified genes that may drive genetic overlaps between diseases. Collectively, we demonstrate that complex genetic relationships exist among neurodegenerative and neuropsychiatric diseases, highlighting putative pleiotropic genomic regions and genes. These findings imply sharing of pathogenic processes and the potential existence of common therapeutic targets.


INTRODUCTION
Neurodegenerative diseases are a group of syndromically-defined disorders that are characterised by the progressive loss of the structure and function of the central nervous system. They are typically grouped by their predominant neuropathological protein deposit (e.g. synucleinopathies, like Parkinson's disease and Lewy body dementia, by α-synuclein deposition and Alzheimer's disease by deposition of amyloid), but more often than not, they present with co-pathologies, suggesting that they might share common pathogenic pathways 1,2 . This notion is supported by genome-wide association studies (GWASs), which have (i) identified shared risk loci across neurodegenerative diseases, such as APOE and BIN1 in Alzheimer's disease (AD) and Lewy body dementia (LBD), or GBA, SNCA, TMEM175 in Parkinson's disease (PD) and LBD and (ii) demonstrated that genetic risk scores derived from one neurodegenerative disease can predict risk of another, as with AD and PD scores predicting risk of LBD [3][4][5] . The importance of identifying common pathogenic processes cannot be overstated, given the implications for our mechanistic understanding of these diseases as well as identification of common therapeutic targets benefitting a wider range of patients.
From a clinical perspective, neurodegenerative diseases are often also defined in terms of their predominant symptom (e.g. AD by memory impairment or PD by parkinsonism), but in reality, present as highly heterogenous diseases, with symptoms spanning multiple domains including neuropsychiatric symptoms 6,7 .
Indeed, a higher prevalence of depression has been observed in individuals with dementia compared to those without dementia 8 . Furthermore, depression and anxiety are more common in individuals with PD compared to the general population, with clinically significant symptoms in 30-35% of patients 9,10 . A similar (albeit reversed) phenomenon has been observed in some neuropsychiatric disorders, with a higher risk of dementia diagnoses observed in individuals with schizophrenia (SCZ) versus individuals without a history of serious mental illness 11,12 and a higher risk of PD in individuals diagnosed with depressive disorder in mid or late life 10,13,14 . Together, these observations suggest the possibility of intersecting pathways between neurodegenerative and neuropsychiatric diseases.
Given these clinical and neuropathological overlaps, genetic overlaps would also be expected. However, a study of global genetic correlation between neurological phenotypes demonstrated limited overlap between individual neurodegenerative diseases as well as between neurodegenerative diseases and neuropsychiatric disorders 15,16 . Genetic correlation (r g ) is a frequently used measure of genetic overlap, which is traditionally studied on a genome-wide scale, and thus, represents an average of the shared genetic effects across all causal loci in the genome 17 . This global approach may not capture shared genetic effects that are confined to particular regions of the genome (i.e. local r g s) or local r g s that have opposing directions across the genome 15,17 . Indeed, local r g s have been observed between neuropsychiatric 1 traits 18 and between AD and PD (specifically in the HLA 19 and MAPT loci 20 ). In addition, previous work using approaches that can detect polygenic overlap (including overlaps where there are mixed patterns of allelic effect directions) have demonstrated a global polygenic overlap between neurodegenerative and neuropsychiatric diseases, such as AD and bipolar disorder (BIP) 21 , AD and major depressive disorder (MDD) 22,23 , and PD and SCZ 24 . Collectively, these studies indicate that local genetic overlaps likely exist between neurodegenerative and neuropsychiatric diseases.
Here, we assess local r g between 3 neurogenerative diseases (AD, LBD and PD) and 3 neuropsychiatric disorders (BIP, MDD and SCZ). All 6 disease traits represent globally prevalent diseases 25 , have reasonably large GWAS cohorts 3,5,[26][27][28][29][30] , and importantly, have demonstrated evidence of a potential genetic overlap (but have not, to our knowledge, been systematically assessed all together for local r g ). To estimate local r g from GWAS summary statistics, we used the recently developed tool local analysis of [co] variant association (LAVA) 31 . Unlike existing tools, such as rho-HESS 32 and SUPERGNOVA 33 , which only permit testing of local r g s between two traits, LAVA is additionally able to model local genetic relations using more than two traits simultaneously, thus permitting exploration of local conditional genetic relations between multiple traits (a particularly useful feature in the context of neurodegenerative diseases like LBD, which has been hypothesised to lie on a disease continuum between AD and PD 5,34 ). In addition, we use data from blood-and brain-derived gene expression traits, in the form of expression quantitative loci (eQTLs), to facilitate functional interpretation of local r g s between disease traits.

Local analyses reveal genetic correlations among neurodegenerative and neuropsychiatric diseases
We applied LAVA to 3 neurodegenerative diseases (AD, LBD and PD) and 3 neuropsychiatric disorders (BIP, MDD and SCZ) ( Table  1), all of which represent globally prevalent diseases 25 . Among neurodegenerative diseases, AD and PD are the most common, with a global prevalence of 8.98% and 1.12% in individuals >70 years of age 6,7,25 and consequently, have large GWAS cohorts (AD, N cases = 71,880; PD, N cases = 33,674) 3,26 . LBD is the second most common dementia subtype after AD, affecting between 4.2 and 30% of dementia patients 35 . As such, the LBD GWAS cohort is small (N cases = 2591), but unlike AD and PD neurodegenerative GWASs, 69% of the cohort is pathologically defined 5 . Among neuropsychiatric disorders, MDD is the second most prevalent, with an estimated 185 million people affected globally (equivalent to 2.49% of the general population), while BIP and SCZ have a prevalence of 0.53% and 0.32%, respectively 25 . Accordingly, all 3 disorders have large, well-powered GWASs (BIP, N cases = 41,917; MDD, N cases = 170,756; SCZ, N cases = 40,675) [28][29][30] .
We tested pairwise local genetic correlations (r g s) across a targeted subset of 300 local autosomal genomic regions that contain genome-wide significant GWAS loci from at least one trait ( Supplementary Fig. 1, Supplementary Table 1). These genomic regions, henceforth referred to as linkage disequilibrium (LD) blocks, were filtered from the original 2,495 LD blocks generated by Werme et al. 31 using a genome-wide partitioning algorithm that aims to reduce LD between LD blocks.
First, we performed a univariate test for every disease trait at each of the 300 LD blocks to ensure sufficient local genetic signal was present to proceed with bivariate local r g analyses. Pairs of traits exhibiting a univariate local genetic signal of p < 0.05/300 were carried forward to bivariate tests, resulting in 1603 bivariate tests across 275 distinct LD blocks. Using a Bonferroni-corrected p value threshold of p < 0.05/1603, we detected 77 significant bivariate local r g s across 59 distinct LD blocks, with 25 local r g s between trait pairs where no significant global r g was observed ( Fig. 1a, b, Supplementary Tables 2, 3). These 25 correlations included: (i) local r g s between all 3 neurodegenerative diseases and SCZ; (ii) a local r g between PD and BIP; and (ii) 20 local r g s between AD and PD. For 30 of the 77 local r g s, the genetic signal of both disease traits may overlap entirely, as suggested by the upper limit of the 95% confidence interval (CI) for explained variance (i.e. r 2 , the proportion of variance in genetic signal of one disease trait in a pair explained by the other) including 1. Notably, the trait pairs where the upper limit of the 95% CI did not include 1 all involved at least one neurodegenerative disease, with the one exception being a local r g between PD and SCZ, suggesting that the genetic overlap between neurodegenerative diseases is smaller than between neuropsychiatric disorders in the tested LD blocks (Fig. 1c).
We found no overlap between local r g s from our study and local genetic associations reported using rho-HESS, a tool for local r g estimation 19 . Furthermore, we found no overlap between local r g s from our study and shared genetic loci identified using the conditional/conjunctional false discovery rate (FDR) approach 21,23,24 , a tool for the estimation of global polygenic overlap (Supplementary Note, Supplementary Table 4). We did, however, demonstrate an overlap between local r g s from our study and local r g s reported using LAVA in a study of 10 psychiatric disorders and 10 substance abuse phenotypes 18  Local analyses associate disease-implicated genomic regions with previously unrelated traits Across the 77 local r g s, 22 involved trait pairs where both traits had genome-wide significant single nucleotide polymorphisms (SNPs) overlapping the LD block tested, 35 involved trait pairs where one trait in the pair had genome-wide significant SNPs overlapping the LD block tested and 20 involved trait pairs where neither trait had genome-wide significant SNPs overlapping the LD block tested (Fig. 2a). Thus, despite the targeted nature of our approach (which biased analyses towards LD blocks that contain genome-wide significant GWAS SNPs), 71% of the detected local r g s linked genomic regions implicated by one of the six disease traits with seemingly unrelated disease traits.
Local r g analyses also highlighted relationships between neurodegenerative traits in regions containing well-known, disease-implicated genes, such as: (i) SNCA (implicated in monogenic and sporadic forms of PD 3,5 ) in LD block 681  We also noted a positive correlation between AD and PD in LD block 2128 (chr16:29,043,178-31,384,210), which contains the AD-associated KAT8 locus 26 and the PD-associated SETD1A locus 3 (of note, rare loss-of-function variants in SETD1A are associated with schizophrenia 40 ). Given concerns that UK Biobank (UKBB) by-proxy cases could potentially be misdiagnosed (particularly in AD 41 ), resulting in spurious r g s between AD and PD, we performed sensitivity analyses using GWASs for AD and PD that excluded UKBB by-proxy cases, the results of which indicated this was not the case ( Supplementary Fig. 3  Local heritability of Lewy body dementia in an APOEcontaining LD block is only partly explained by Alzheimer's disease and Parkinson's disease Eleven LD blocks were associated with >1 trait pair, of which 8 LD blocks had a trait in common across multiple trait pairs. In other words, the genetic component of one disease trait (the outcome trait) could be modelled using the genetic components of multiple predictor disease traits. To explore the independent effects of predictor traits on the outcome trait, as well as potential confounding between predictors, we applied local multiple regression.
A total of 14 multivariate models were run across all 8 LD blocks. In 2 of these models, all predictor traits were found to significantly (and by extension, independently) contribute to the local heritability of the outcome trait (Fig. 3a, Supplementary Table 6). For example, in the APOE-containing LD block 2351 (chr19:45,040,933-45,893,307), fitting a conditional model that included both AD and PD as predictor traits of LBD demonstrated that both independently contributed to the genetic signal of LBD.
In 4 models, only one predictor trait was significant, suggesting that one predictor may account for the relationship of the outcome trait with other non-significant predictors (Fig. 3a, Supplementary Table 6). In the remaining 8 models, all predictor traits were nonsignificant, despite significant bivariate correlations, which could indicate collinearity between predictors ( Supplementary Fig. 4). Examples of the latter two situations (i.e. 0 or 1 significant predictor trait) are given in the Supplementary Note.
We noted that all models with a neuropsychiatric outcome trait and neuropsychiatric predictor traits had a high multivariate r 2 (range: 0.53-1), with upper confidence intervals including 1 (Fig. 3b), suggesting that the genetic signal of the neuropsychiatric outcome trait could be entirely explained by its predictor traits in these LD blocks. In contrast, in LD block 2351, the multivariate r 2 was 0.43 (95% CI: 0.38 to 0.5), a result that held using GWASs for AD and PD that excluded by-proxy cases (r 2 = 0.49, 95% CI: 0.44 to 0.57; Supplementary Fig. 3). Thus, while AD and PD jointly explained approximately 40% of the local heritability of LBD, a proportion of the local heritability for LBD was independent of AD and PD. showing the number of traits within trait pairs demonstrating significant local r g s that had genome-wide significant SNPs overlapping the tested LD block (as illustrated by the schematic on the right). b Two LD blocks illustrating the situations depicted in (a). Edge diagrams for each LD block show the standardised coefficient for r g (rho, ρ) for each significant bivariate local r g . Significant negative and positive r g s are indicated by blue and red colour, respectively. c Heatmaps show the rho for each bivariate local r g within the LD block. Asterisks (*) indicate r g s that were replicated when using AD and PD GWASs that excluded UK Biobank by-proxy cases. Significant negative and positive r g s are indicated by blue and red fill, respectively. Non-significant r g s have a grey fill. In both (b, c) panels are labelled by the LD block identifier, the traits with genome-wide significant SNPs overlapping the LD block (indicated in the brackets) and the genomic coordinates of the LD block (in the format chromosome:start-end, GRCh37).

Incorporation of gene expression traits to facilitate functional interpretation of disease trait correlations
To dissect whether regulation of gene expression might underlie local r g s between disease traits, we performed local r g analyses using expression quantitative trait loci (eQTLs) from eQTLGen 42 and PsychENCODE 43 , which represent large human blood and brain expression datasets, respectively (Table 1). We used LAVA to study relationships between gene expression and disease traits on account of its ability to model the uncertainty in eQTL effect estimates (unlike the commonly used TWAS framework, which as a result, has an increased type 1 error rate 44 ). In addition, where three-way relationships were observed between 2 disease traits and an eQTL, we computed partial correlations to determine whether correlations between disease traits could be explained by the eQTL.
We restricted analyses to the 5 LD blocks highlighted in Fig. 2  interest to at least one of the disease traits implicated by local r g analyses. From these LD blocks of interest, we defined genic regions (gene start and end coordinates ± 100 kb) for all overlapping protein-coding, antisense or lincRNA genes (n = 92).
We detected a total of 135 significant bivariate local r g s across 47 distinct genic regions (FDR < 0.05), with 43 local r g s across 27 distinct genic regions between trait pairs involving a disease trait and a gene expression trait ( Supplementary Fig. 5, Supplementary Table 7). We noted that the explained variance (r 2 ) between trait pairs involving a disease trait and a gene expression trait tended to be lower than between trait pairs involving two disease traits ( Supplementary Fig. 6), an observation that aligns with a recent study that found only 11% of trait heritability to be mediated by bulk-tissue gene expression 45 .
With the exception of the SNCA-containing LD block 681 (chr4:90,236,972-91,309,863), where eQTLs for only 1 out of 5 genes tested in the block were correlated with a disease trait (negative r g between blood-derived SNCA eQTLs and PD), the expression of multiple genes was associated with disease traits across the remaining LD blocks (Fig. 4a). In addition, the  expression of several genes was associated with more than one disease trait (Fig. 4b). For example, blood-and brain-derived ANKK1 eQTLs (DRD2-containing LD block 1719, chr11:112,755,447-113,889,019) were negatively correlated with both MDD and SCZ, which themselves were positively correlated (Fig. 4c). A SNP residing in the coding region of ANKK1 (rs1800497, commonly known as TaqIA SNP) has been previously associated with alcoholism, schizophrenia and eating disorders, although it is unclear whether this SNP exerts its effect via DRD2 or ANKK1 46 . Conditioning the local r g between MDD and SCZ on ANKK1 eQTLs  Table 9). A high degree of eQTL sharing across disease traits was observed in the CLU-containing LD block 1273 (chr8:27,406,512-28,344,176), with blood-derived eQTLs from 5 out of the 6 genes implicated in local r g s found to correlate with both AD and PD (Fig. 4b, d). This included situations where eQTL-disease trait correlations had (i) the same direction of effect across both disease traits (as observed with PBK, PNOC and SCARA5) or (ii) opposing directions of effect across both disease traits (as observed with CLU and ESCO2) (Fig. 4d). Notably, while a significant positive local r g was observed between AD and PD in the PBK and SCARA5 genic regions (reflecting the positive local r g observed between AD and PD across the entire LD block), no local r g was observed between AD and PD in the CLU genic region, suggesting that the shared risk of AD and PD in LD block 1273 may be driven by the expression of genes other than the ADassociated CLU (Fig. 4e). Indeed, while conditioning the local r g between AD and PD on SCARA5 eQTLs had little effect on the strength and significance of the correlation (AD~PD, r g = 0.16, p = 0.000135; AD~PD |SCARA5, r g = 0.15, p = 0.000487; Supplementary Table 9), the local r g between AD and PD was weakened and no longer significant after conditioning on PBK eQTLs, indicating the PBK eQTLs may partly explain the local r g between AD and PD (AD~PD, r g = 0.14, p = 0.0187; AD~PD |PBK, r g = 0.07, p = 0.259; Supplementary Table 9).
Compared to LD block 1273, the degree of eQTL sharing across disease traits was lower in the APOE-containing LD block 2351 (chr19:45,040,933-45,893,307), with eQTLs from 4 out of 16 genes implicated in local r g s found to correlate with AD and one of PD or LBD (Fig. 4b, f). Shared eQTL genes were only observed in blood and included BCL3, CLPTM1, PVRL2 and TOMM40, with expression of BCL3 and CLPTM1 positively correlating with AD and PD and expression of PVRL2 and TOMM40 positively correlating with AD and LBD. As the exception, PVR eQTLs were negatively associated with both AD and PD albeit in different tissues: AD in brain and PD in blood. Expression of the remaining 11 genes was exclusively associated with either AD (n = 8) or PD (n = 3). No significant local r g was observed between APOE eQTLs and AD (FDR < 0.05), although a nominal positive r g was observed in blood (ρ = 0.178, ρ CI = 0.007 to 0.352, p = 0.039; Supplementary Fig. 5e, Supplementary Table 7). Overall, these results indicate that risk of neurodegenerative diseases (in particular, AD) is associated with expression of multiple genes in the APOE-containing LD block. Further, they add to a growing body of evidence suggesting that in parallel with the well-studied APOE-ε4 risk allele, there are additional APOE-independent risk factors in the region (such as BCL3 47 and PVRL2 48 ) that may contribute to AD risk.  Table 7.

DISCUSSION
Despite clinical and neuropathological overlaps between neurodegenerative diseases, global analyses of genetic correlation (r g ) show minimal r g among neurodegenerative diseases or across neurodegenerative and neuropsychiatric diseases. However, local r g s can deviate from the genome-wide average estimated by global analyses and may even exist in the absence of a genomewide r g , thus motivating the use of tools to model local genetic relations.
Here, we applied LAVA to 3 neurodegenerative diseases and 3 neuropsychiatric disorders to determine whether local r g s exist in a subset of 300 LD blocks that contain genome-wide significant GWAS loci from at least one of six investigated disease traits. We identified 77 significant bivariate local r g s across 59 distinct LD blocks, with 25 local r g s between trait pairs where no significant global r g was observed, including between (i) all 3 neurodegenerative diseases and SCZ and (ii) AD and PD. Local r g s highlighted expected associations (e.g. AD and LBD in the APOE-containing LD block 2351 5 , chr19:45,040,933-45,893,307) and putative new associations (e.g. AD and PD in the CLU-containing LD block 1273, chr8:27,406,512-28,344,176) in genomic regions containing well-known, disease-implicated genes. Likewise, incorporation of eQTLs confirmed known relationships between diseases and genes, such as the association of AD with CLU expression 38 and PD with SNCA expression in blood 49 , and revealed putative new disease-gene relationships. Together, these results indicate that more complex aetiological relationships exist between neurodegenerative and neuropsychiatric diseases than those revealed by global r g s. Further, they highlight potential gene expression intermediaries that may account for local r g s between disease traits.
These findings have important implications for our understanding of neurodegenerative diseases and the extent to which they overlap. An overlap between the synucleinopathies and AD is often acknowledged in the context of LBD, which has been hypothesised to lie on a disease continuum between AD and PD 5,34 . In support of this continuum, LBD was found to associate with both AD and PD in the APOE-containing LD block 2351 (chr19:45,040,933-45,893,307). Multiple regression analyses confirmed that AD and PD were significant predictors of LBD heritability in LD block 2351. Importantly, when AD and PD were modelled together, they explained only~40% of the local heritability of LBD in LD block 2351, implying that LBD represents more than the union of AD and PD. Further, the associations of AD and PD with LBD had opposing regression coefficients, suggesting that the contribution of AD and PD to LBD in the APOE locus may not be synergistic. This mirrors the observation that genome-wide genetic risk scores of AD and PD do not interact in LBD 5 , and may indicate that different biological pathways underlie the association Fig. 4 Incorporation of gene expression traits to facilitate functional interpretation of disease trait correlations. a Bar plot of the number of eQTL genes (as defined by their genic regions) tested in each LD block. The fill of the bars indicates whether eQTL genes were significantly correlated with at least one disease trait. b Bar plot of the number of eQTL genes that were significantly correlated with at least one disease trait. The fill of the bars indicates whether eQTL genes in local r g s were correlated with one or more disease traits. c, d, f Heatmaps of the standardised coefficient for r g (rho) for each significant gene expression-disease trait correlation (FDR < 0.05) within LD block (c) 1719, (d) 1273 and (f) 2351. Genes are ordered left to right on the x-axis by the genomic coordinate of their gene start. Panels are labelled by the eQTL dataset from which eQTL genes were derived (either PsychENCODE's analysis of adult brain tissue from 1387 individuals or the eQTLGen metaanalysis of 31,684 blood samples from 37 cohorts). e Edge diagrams for representative genic regions show the rho for each significant bivariate local r g (FDR < 0.05). GWAS and eQTL nodes are indicated by grey and white fill, respectively. Panels are labelled by the gene tested and the eQTL dataset from which eQTL genes were derived. In (c-f) significant negative and positive r g s are indicated by blue and red colour, respectively. Coordinates for LD blocks (in the format chromosome:start-end, between LBD and AD/PD. Indeed, only blood-derived PVRL2 and TOMM40 eQTLs were found to correlate with both AD and LBD, while no shared eQTL genes were detected between PD and LBD. Less acknowledged is the genetic overlap between AD and PD, with no global r g reported between the two diseases 16,50 and no significant evidence for the presence of loci that increase the risk of both diseases 51 . As the exception, genetic overlaps have been reported between AD and PD in the HLA 19 and MAPT loci 20 , hinting that pleiotropy may exist locally. In support of local pleiotropy, we observed 20 local r g s between AD and PD in genomic regions containing disease-implicated genes, such as SNCA (LD block 681, chr4:90,236,972-91,309,863) and CLU (LD block 1273, chr8: 27,406,344,176). In the case of the CLUcontaining LD block 1273 (chr8:27,406,512-28,344,176), incorporation of eQTLs demonstrated an association of AD and PD with the expression of 5 genes, although partial correlations suggested that only PBK expression could explain the correlation between AD and PD. PBK encodes a serine-threonine kinase involved in regulation of cellular proliferation and cell-cycle progression 52 , which has been shown to be overexpressed in proliferative cells, including neural precursors cells in the subventricular zone of the adult brain 52,53 . The remaining associations between eQTLs and AD or PD, which included an association between the ferritin receptor SCARA5 54 and both AD and PD, appeared to operate independently across diseases. Notably, cellular iron overload and ironinduced oxidative stress have been implicated in several neurodegenerative diseases such as AD and PD 54,55 . In contrast, only blood-derived SNCA eQTLs were associated with PD in LD block 681 (chr4:90,236,972-91,309,863), suggesting that the association between AD and PD at the SNCA locus could be driven by tissue-or context-dependent gene expression or alternatively other molecular phenotypes.
A few studies have demonstrated genetic overlaps between neurodegenerative and neuropsychiatric diseases, such as AD and BIP 21 , AD and MDD 22,23 , and PD and SCZ 24 , while others have demonstrated no overlap 16,56 , with divergences in outcomes ascribed to differences in methodology and cohort 22 . Here, we observed a local r g between BIP and PD, in addition to local r g s between schizophrenia and all 3 neurodegenerative diseases, which in the case of LBD was observed in an LD block containing the gene DRD2 (LD block 1719, chr11:112,755,447-113,889,019). Notably, parkinsonism in dementia with Lewy bodies (DLB, one of the two LBDs), is often less responsive to dopaminergic treatments than in PD 57 . Furthermore, methylation of the DRD2 promoter in leucocytes has been shown to differ between DLB and PD 58 , while D2 receptor density has been shown to be significantly reduced in the temporal cortex of DLB patients, but not AD 59 , suggesting that the DRD2 locus may harbour markers that could distinguish between these neurodegenerative diseases. Our study adds to the body of evidence in favour of a shared genetic basis between neurodegenerative and neuropsychiatric diseases, although further work will be required to determine whether this genetic overlap underlies the clinical and epidemiological links observed between these two disease groups.
This study is not without its limitations, with several limitations related to the input data. These limitations include: (i) the variability in cohort size (sample size is a key determinant of the power to detect the association of a variant with a trait), which in the case of the smallest GWAS, LBD, may explain the limited number of local r g s observed involving this trait; (ii) the risk of misdiagnosis (particularly in GWASs that include broader definitions of a disorder, such as the MDD GWAS, which includes the UK Biobank broad definition of depression as well as clinically-derived phenotypes for MDD); (iii) the lack of X chromosome in all but one trait (notably, the X chromosome is not only longer than chromosome 8-22, but according to Ensembl v106 60 encodes 858 and 689 protein-coding and non-coding genes, respectively); and (iv) the lack of genetic diversity (i.e. all traits used were derived from individuals of European ancestry). Given that population-specific genetic risk factors exist, such as the lack of MAPT GWAS signal in the largest GWAS of Asian patients with PD 61 , and that transethnic global r g s between traits such as gene expression are significantly less than 1 62 , it is imperative that studies of local r g are expanded to include diverse populations.
Among methodological limitations, analyses were restricted only to genomic loci with evidence of trait association. Exploring all genomic loci may show further loci of pleiotropy between conditions, but is beyond the scope of the current study. Furthermore, as mentioned by the developers of LAVA 31 , local r g s could potentially be confounded by association signals from adjacent genomic regions, a limitation which is particularly pertinent in our analysis of gene expression traits where LD blocks were divided into smaller (often overlapping) genic regions. Additional fine-mapping (both computational and biological) could be helpful in narrowing down the set of potentially causal variants and consequently the genomic regions of interest 63 .
Importantly, as with any genetic correlation analysis, an observed r g does not guarantee the presence of true pleiotropy. Spurious r g s can occur due to LD or misclassification 17 . Here, we attempted to address the potential misclassification of by-proxy cases via sensitivity analyses using GWASs for AD and PD that excluded UKBB by-proxy cases. We replicated 2 of the 3 significant local r g s observed in 2 LD blocks when using GWASs with byproxy cases (Supplementary Note). However, we were unable to test for local r g s across the remaining 19 LD blocks due to insufficient univariate signal, which could reflect (i) a genuine contribution of by-proxy cases to trait h 2 in the region or (ii) a lack of statistical power to detect a genetic signal. Given the substantial decrease in cohort numbers when UKBB by-proxy cases are excluded from AD and PD GWASs (Table 1), a lack of statistical power seems the more likely explanation, warranting a revisit of this analysis as clinically-diagnosed and/or pathologically-defined cohorts increase in size.
Finally, even where observed r g s potentially represent true pleiotropy, LAVA cannot discriminate between vertical and horizontal pleiotropy (refs. 17,31 ). Thus, while incorporation of gene expression can provide testable hypotheses regarding the underlying genes and biological pathways that drive relationships between neurodegenerative and neuropsychiatric diseases, experimental validation is required to establish the extent to which these genes represent genuine intermediary phenotypes.
In summary, our results have important implications for our understanding of the genetic architecture of neurodegenerative and neuropsychiatric diseases, including the demonstration of local pleiotropy particularly between neurodegenerative diseases. Not only do these findings suggest that neurodegenerative diseases may share common pathogenic processes, highlighting putative gene expression intermediaries which may underlie relationships between diseases, but they also infer the existence of common therapeutic targets across neurodegenerative diseases that could be leveraged for the benefit of broader patient groups.

METHODS
Trait pre-processing Summary statistics from a total of 8 distinct traits were used, including 6 disease traits and 2 gene expression traits. Disease traits included 3 neurodegenerative diseases (Alzheimer's disease, AD; Lewy body dementia, LBD; and Parkinson's disease, PD) and 3 neuropsychiatric disorders (bipolar disorder, BIP; major depressive disorder, MDD; and schizophrenia, SCZ) 3,5,[26][27][28][29][30] . Gene expression traits were used to facilitate functional interpretation of local genetic correlations (r g ) between disease traits. Gene expression traits included expression quantitative trait loci (eQTLs) from eQTLGen 42 and PsychENCODE 43 , which represent large human blood and brain expression datasets, respectively. All traits used were derived from individuals of European ancestry. Details of all summary statistics used can be found in Table 1.
Where necessary, SNP genomic coordinates were mapped to Reference SNP cluster IDs (rsIDs) using the SNPlocs.Hsapiens.dbSNP144.GRCh37 package 64 . In the case of the PD GWAS without UK Biobank (UKBB) data (summary statistics were kindly provided by the International Parkinson Disease Genomics Consortium), additional quality control filtering was applied, including removal of SNPs (i) with MAF < 1%, (ii) displaying an I 2 heterogeneity value of ≥80 and (iii) where the SNP was not present in at least 9 out of the 13 cohorts included in the metaanalysis.
Global genetic correlation analysis and estimation of sample overlaps Across disease trait pairs, LD score regression (LDSC) was used to (i) estimate the observed-scale SNP heritability of each trait (which assumes a continuous liability, and thus may differ from liabilityscale estimates of SNP heritability), (ii) determine the global r g and (iii) estimate sample overlap 65,66 . All disease traits had significant SNP-based heritability (Z-score > 2) and met with the criteria suggested for reliable estimates of genetic correlation, which include: (i) heritability Z-score > 1.5 (optimal > 4), (ii) mean Chi square of test statistics > 1.02, and (iii) intercept estimated from SNP heritability analysis is between 0.9 and 1.1 67 (Supplementary  Table 2). We note that the heritability Z-score of LBD was 2.27, which is below the optimal suggested, and as such, can be expected to produce larger standard errors around estimates of global r g .
Summary statistics for each trait were pre-processed using LDSC's munge_sumstats. The estimated sample overlap was used as an input for LAVA, given that potential sample overlap between GWASs could impact estimated local r g s 31 . Any shared variance due to sample overlap was modelled as a residual genetic covariance. As performed by Werme et al. 31 , a symmetric matrix was constructed, with offdiagonal elements populated by the intercepts for genetic covariance derived from cross-trait LDSC and diagonal elements populated by comparisons of each trait with itself. This symmetric matrix was then converted to a correlation matrix. Importantly, it is not possible to estimate sample overlap with eQTL summary statistics, but given that the cohorts used in the GWASs were different from the cohorts included in the eQTL datasets, we assumed sample overlap between GWASs and eQTL datasets to be negligible. Thus, they were set to 0 in the correlation matrix. However, given the inclusion of GTEx samples in both eQTL datasets and our inability to estimate this overlap, downstream LAVA analyses were performed separately for each eQTL dataset.
Defining genomic regions for local genetic correlation analysis Between disease traits. Genome-wide significant loci (p < 5 × 10 −8 ) were derived from publicly available AD, BIP, LBD, MDD, PD and SCZ GWASs. Genome-wide significant loci were overlapped with linkage disequilibrium (LD) blocks generated by Werme et al. 31 using a genome-wide partitioning algorithm. Briefly, each chromosome was recursively split into blocks using (i) a break point to minimise LD between the resulting blocks and (ii) a minimum size requirement. The resulting LD blocks represent approximately equal-sized, semi-independent blocks of SNPs, with a minimum size requirement of 2,500 SNPs (resulting in an average block size of around 1Mb). Only those LD blocks containing genome-wide significant GWAS loci from at least one trait were carried forward in downstream analyses, resulting in a total of 300 autosomal LD blocks. Of the 22 possible autosomes, 21 contained LD blocks with overlapping loci, with the highest number of LD blocks located in chromosome 1 and 6 ( Supplementary Fig. 1). LD block locations were in reference to build GRCh37 and are presented in the format: LD block identifier, chromosome:start-end.
Between disease and gene expression traits. A total of 5 LD blocks, as highlighted by bivariate local r g analysis of disease traits, were used in this analysis ( . From these LD blocks of interest, we defined genic regions for all protein-coding, antisense or lincRNA genes that overlapped an LD block of interest. Genic regions were defined as the start and end coordinates of a gene (Ensembl v87, GRCh37) with an additional 100 kb upstream and 100 kb downstream of gene start/end coordinates. We included a 100-kb window as most lead cis-eQTL SNPs (i.e. the SNP with the most significant p-value in a SNP-gene association) lie outside the gene start and end coordinates and are located within 100 kb of the gene (in eQTLGen, 55% of lead-eQTL SNPs were outside the gene body and 92% were within 100 kb from the gene 42 ). These genic regions (n = 92) were carried forward in downstream analyses. For a given genic region, we then used all SNPs for which eQTL summary statistics for the relevant gene were available (e.g. all SNP-gene pairs that relate to CLU in the CLU genic region).

Estimating bivariate local genetic correlations
Between disease traits. The detection of valid and interpretable local r g requires the presence of sufficient local genetic signal. For this reason, a univariate test was performed as a filtering step for bivariate local r g analyses. Bivariate local r g analyses were only performed for pairs of disease traits which both exhibited a significant univariate local genetic signal (p < 0.05/300, where the denominator represents the total number of tested LD blocks). This step resulted in a total of 1,603 bivariate tests spanning 275 distinct LD blocks. Bivariate results were considered significant when p < 0.05/1603.
We compared local r g s to existing results from studies of: (i) AD and PD using rho-HESS 19 ; (ii) AD and BIP 21 , AD and MDD 23 , PD and SCZ 24 , all of which used a conditional/conjunctional FDR approach (conditional FDR is an extension of the standard FDR method, and re-ranks the test statistics of a primary phenotype based on the strength of the association with a secondary phenotype, while conjunctional FDR is used post-hoc to identify shared genetic loci); and (iii) 10 psychiatric disorders and 10 substance abuse phenotypes using LAVA 18 . For all comparisons, genomic coordinates were used to overlap either SNPs 21,24 or genomic regions 18,19 with LD blocks. SNPs were converted from rsIDs to their GRCh37 genomic coordinates using the SNPlocs.Hsapiens.dbSNP144.GRCh37 package 64 . In all comparisons, local r g s from this study were filtered to include only overlapping disease traits. Results are described in the Supplementary Note.
Between disease and gene expression traits. For each genic region, only those disease traits that were found to have a significant local r g in the associated LD block were carried forward to univariate and bivariate analyses with eQTL summary statistics. As previously described, a univariate test was performed as a filtering step for distinct genic regions. Bivariate results were corrected for multiple testing using two strategies: (i) a more lenient FDR correction and (ii) a more stringent Bonferroni correction (p < 0.05/n_tests, where the denominator represents the total number of bivariate tests). We discuss results passing FDR < 0.05, but we make the results of both correction strategies available (Supplementary Table 7,  Supplementary Table 8).
We evaluated the effect of window size on bivariate correlations by re-running all analyses using a 50-kb window. Following filtering for significant univariate local genetic signal (as described above), a total of 267 bivariate tests were run spanning 50 distinct genic regions. We detected 110 significant bivariate local r g s (FDR < 0.05), 83 of which were also significant when using a 100kb window (Supplementary Fig. 7). We observed strong positive Pearson correlations in local r g coefficient and p-value estimates across the two window sizes, indicating that our results are robust to the choice of window size (Supplementary Fig. 7). Of note, p-value estimates between disease and gene expression traits tended to be lower when using the 50-kb window, as compared to the 100-kb window, as evidenced by the fitted line falling below the equivalent of y = x. This observation may be a reflection of stronger cis-eQTLs tending to have a smaller distance between SNP and gene 42 . In contrast, p-value estimates between two disease traits were comparable across the two window sizes.
Partial correlations were computed where three-way relationships were observed between 2 disease traits and an eQTL. The partial correlation reflects the correlation between 2 traits (e.g. disease X and Y) that can be explained by a third trait (e.g. eQTL, Z). Thus, a partial correlation approaching 0 suggests that trait Z captures an increasing proportion of the correlation between traits X and Y. Due to the three-way nature of the relationships, 3 possible conformations were possible (i.e. X~Y|Z, X~Z|Y and Y~Z| X); partial correlations were computed for all 3.

Local multiple regression
For LD blocks with significant bivariate local r g between one disease trait and ≥2 disease traits, multiple regression was used to determine the extent to which the genetic component of the outcome trait could be explained by the genetic components of multiple predictor traits. In those LD blocks where a three-way relationship was observed between 3 disease traits (e.g. X, Y and Z were all significantly correlated with one another), 3 possible conformations of 2 predictor models were possible (i.e. X~Y + Z, Y~X + Z, and Z~X + Y). In these situations, each disease trait was separately modelled as the outcome trait, resulting in 3 independent models within the LD block.
These analyses permitted exploration of the independent effects of predictor traits on the outcome trait, as well as possible confounding between predictors. A predictor trait was considered significant when p < 0.05. Sensitivity analysis using by-proxy cases As UK Biobank (UKBB) by-proxy cases could potentially be mislabelled (i.e. parent of by-proxy case suffered from another type of dementia) and lead to spurious r g s between neurodegenerative traits, we performed replication analyses using GWASs for AD 27 and PD that excluded UKBB by-proxy cases. LD blocks were filtered to include only those where significant bivariate local r g s were observed between LBD and either by-proxy AD or byproxy PD GWASs, in addition to between by-proxy AD and byproxy PD GWASs. These criteria limited the number of LD blocks to 21. Bivariate local correlations were only performed for pairs of traits which both exhibited a significant univariate local genetic signal (p < 0.05/21, where the denominator represents the total number of tested loci), which resulted in a total of 10 bivariate tests spanning 6 distinct loci. Results are described in the Supplementary Note. We additionally performed multiple regression in LD block 2351 using LBD as the outcome and AD and PD (both excluding UKBB by-proxy cases) as predictors. A predictor trait was considered significant when p < 0.05.

Reporting summary
Further information on research design is available in the Nature Research Reporting Summary linked to this article.

DATA AVAILABILITY
Analyses in this study relied on publicly available data, all of which are listed in Box 1. In the case of the PD GWAS without UK Biobank (UKBB) data, summary statistics were kindly provided by the International Parkinson Disease Genomics Consortium: https://pdgenetics.org/.

CODE AVAILABILITY
Code used to pre-process GWASs, run genetic correlation analyses and to generate figures for the manuscript are available at: https://github.com/RHReynolds/ neurodegen-psych-local-corr (https://doi.org/10.5281/zenodo.6587707). All other open-source software used in this paper is listed in Box 1.